Abstract

Three  models  for  positive-valued  and  discrete-valued  stationary  time 
series  are  discussed.   All  have  the  property  that  for  a  range  of  specified 
marginal  distributions  the  time  series  have  the  same  correlation  structure 
as  the  usual  linear,  autoregressive-moving  average  (ARMA)  model.   The 
models differ in the range of marginal distributions which can be accommodated
and in the simplicity and flexibility of each model. Specifically
the EARMA-type processes can be extended from the exponential distribution
to a rather narrow range of continuous distributions; the DARMA-type processes
can be defined usefully for any discrete marginal distribution
and  are  simple  and  flexible.   Finally  the  marginally  controlled  semi- 
Markov  generated  process  can  be  defined  for  any  continuous  or  discrete 
positive-valued  distribution  and  is  therefore  very  flexible.   However 
the  model  suffers  from  some  complexity  and  parametric  obscurity. 


* Research supported by National Science Foundation Grant AF476 and Office
of Naval Research Grant NR-42-284 at the Naval Postgraduate School.


1.    Introduction 

In  much  of  the  current  work  on  the  analysis  of  stationary  time  series 
there  is  an  implicit  assumption  that  the  marginal  distribution  of  the  time 
series is normal. The assumption is implicit in that the marginal distribution
is not considered to be of interest per se in the analysis, and
also  in  that  the  statistical  procedures  which  are  used  are  very  definitely 
based  on  normality  assumptions.   The  stationary  model  on  which  much  of 
this  time  series  analysis  is  based  is  the  mixed  autoregressive  moving 
average  process, 

$$a_0 X_i + a_1 X_{i-1} + \cdots + a_p X_{i-p} = b_0 \varepsilon_i + b_1 \varepsilon_{i-1} + \cdots + b_q \varepsilon_{i-q}, \qquad i = 0, \pm 1, \pm 2, \ldots, \tag{1.1}$$

sometimes called the ARMA(p,q) or Box-Jenkins process. The process (1.1)
is specified quite generally as a linear combination of i.i.d. random
variables $\{\varepsilon_i\}$ of unspecified distribution, the linear, additive
structure determining the correlation structure of the stationary sequence
$\{X_i\}$ under well-known restrictions on the parameters. If one wants $\{X_i\}$
to be a time series with a normal marginal distribution, this
can be accomplished by taking the $\varepsilon_i$'s to be normally distributed. The
model is then completely specified.

There  are,  however,  many  situations  in  which  observations  occur 
serially  and  in  which  the  marginal  distribution  is  patently  non-normal. 
For  example,  data  on  the  number  of  occurrences  of  all  known  diseases  in 
each  week  is  kept  by  the  National  Center  for  Health  Statistics.   The  data 
is  not  only  discrete  count  data,  but  for  many  diseases  it  is  mostly  on 
the  order  of   0,  1,  2,  3,  and  very  seldom  above  this. 


It  has  been  suggested  that  such  non-normal  data  be  handled  by  data 
transformations  and  this  is  probably  appropriate  if  the  data  is  only 
slightly  non-normal.   In  other  cases  it  seems  reasonable  to  start  afresh 
and  develop  models  from  scratch.   In  this  paper  we  summarize  attempts  to 
do  this  for  stationary  time  series  which  are  known  to  be  non-normal  because 
of  either  positivity  or  discreteness  or  both.   The  essence  of  the  models  is 
that  the  marginal  distribution  is  specified,  as  well  as  the  correlation 
structure.   More  generally  the  models  are  required  to  be  simple  and 
flexible  in  the  following  senses: 

a) The models should be specified in terms of easily observed and
measured quantifiers. When the models are stationary, these quantifiers
would typically be

i)   the  marginal  distribution,  and 
ii)   second-order  moments  (correlations) . 

b)  The  models  should  be  parametrically  parsimonious  and  hopefully 
parametrically  meaningful. 

c)  The  models  should  be  easy  to  generate  on  computers,  i.e.,  they  should 
be  structurally  simple;  in  fact  it  might  be  preferable  for  the 
models  to  have  linear  structure. 

d)  The  models  should  be  easy  to  fit  to  data,  both  informally  and 
formally. 

The model (1.1) certainly has most of the above features, but it is
not known in general how to specify the distribution of $\varepsilon_i$ so as to
produce a given, continuous marginal distribution for the $X_i$'s. Moreover,
it is clearly not possible to do this at all if the $X_i$'s are
discrete random variables.


The work described in this paper on non-normal time series is joint
work with D. P. Gaver, P. A. Jacobs and A. J. Lawrance. Although the
work has much broader connotation, it will be described in the context
in which it arose, that of the description of stochastic point processes,
or series of events occurring in time. One way in which these point
processes can be described is as a sequence of intervals between events
$\{X_i\}$, which are of course positive-valued random variables. In the
common case of a Poisson point process the $X_i$'s have an exponential
distribution. However, as in the case of epidemics, point processes
are generally observed as counts of events in successive fixed intervals
and these are non-negative discrete valued random variables. For the
Poisson process these counts are independent and Poisson distributed and
this serves as the null model in the analysis of count data from point
processes.

Three  distinct  models  are  discussed  in  the  context  of  the  analysis 
and  description  of  point  processes.   All  of  them  satisfy  the  requirements 
discussed  above  to  some  degree.   The  EARMA-type  process  described  first 
has  recently  been  extended  to  have  a  complete  ARMA-type  correlation 
structure,  but  the  process  cannot  be  extended  to  all  continuous  marginal 
distributions.   Marginally  controlled  semi-Markov  generated  processes,  on 
the  other  hand,  give  a  complete  analog  to  (1.1),  but  they  do  not  have 
linear  structure.   They  can  also  be  extended  to  give  processes  with 
discrete  marginal  distributions.   A  simpler,  random  linear  structure 
has  been  derived,  however,  which  gives  discrete  processes  with  ARMA 
structure. These are DARMA-type processes and come closer than the
other processes to fulfilling the requirements of simplicity and flexibility.

Further  details  on  the  processes  are  to  be  found  in  the  references. 


2.    Interval Models: Sequences of continuous positive-valued random variables

Univariate  point  processes  in  continuous  time  can  be  described  equally 
well through the structure of the intervals between events $\{X_i\}$, where the
$X_i$'s are continuous and positive-valued random variables, or the counting
process $\{N(t)\}$, where $N(t)$ gives the number of events in $(0,t]$ and is
discrete and non-negative. We discuss the modelling of the intervals
$\{X_i\}$ first. Of course the applications of the models are much broader;
the $X_i$'s might for instance be the magnitudes of successive shocks in
a  sequence  of  earthquakes  or  the  successive  response  times  of  a  computer 
to  messages  sent  via  a  terminal. 

2.1.   The first-order autoregressive exponential model (EAR(1))

In a Poisson process the intervals $\{X_i\}$ are independent and identically
distributed (i.i.d.) random variables with exponential distribution

$$F_X(x) = 1 - e^{-\lambda x}, \qquad \lambda > 0; \quad x > 0. \tag{2.1}$$

Several  attempts  have  been  made  to  generalize  the  Poisson  process  by 
making the $X_i$ dependent, but with exponential or conditionally
exponential marginal distributions (Cox, 1955). The simplest and only
really successful attempt in the sense of broad applicability (Gaver
and Lewis, 1978) gives a process called the EAR(1) model, derived from
the  following  consideration. 

A  first-order  autoregressive  stochastic  sequence  is  defined  by  the 
stochastic  difference  equation  (a  special  case  of  (1.1)) 


$$X_i = \rho X_{i-1} + \varepsilon_i, \qquad i = 0, \pm 1, \pm 2, \ldots; \quad |\rho| < 1, \tag{2.2}$$

where the $\varepsilon_i$ are assumed to be an i.i.d. stationary random sequence.
If the $\varepsilon_i$ are normally distributed, so are the $X_i$. What must the
distribution of the $\varepsilon_i$ be in order for the $X_i$ sequence to be stationary
with an exponential($\lambda$) distribution? The answer is surprisingly easy
(Gaver and Lewis, 1978).

Let $0 < \rho < 1$, and let $\{E_i\}$ be an i.i.d. exponential($\lambda$) sequence.
Now let $\varepsilon_i$ be equal to zero with probability $\rho$ and equal to $E_i$
with probability $1-\rho$. Then we have


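$$X_i = \begin{cases} \rho X_{i-1} & \text{with probability } \rho, \\ \rho X_{i-1} + E_i & \text{with probability } (1-\rho), \end{cases} \tag{2.3}$$

$$X_i = \rho X_{i-1} + V_i E_i, \tag{2.4}$$

where $\{V_i\}$ is an i.i.d. binary sequence and $P\{V_i = 0\} = 1 - P\{V_i = 1\} = \rho$.
Moreover if we let $X_0 = E_0$, and define $X_i$ as in (2.3), the resulting
sequence is stationary for $i = 0, 1, \ldots$.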
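
The choice of innovation can be checked with Laplace-Stieltjes transforms. The following is a brief sketch; the transform symbol $\phi$ is notation introduced here for illustration and does not appear in the original text. Since $\varepsilon_i$ is independent of $X_{i-1}$ in (2.2), stationarity with an exponential($\lambda$) marginal requires

```latex
\phi_X(s) = E\,e^{-sX_i} = \frac{\lambda}{\lambda + s},
\qquad
\phi_X(s) = \phi_X(\rho s)\,\phi_\varepsilon(s)
\;\Longrightarrow\;
\phi_\varepsilon(s) = \frac{\phi_X(s)}{\phi_X(\rho s)}
  = \frac{\lambda + \rho s}{\lambda + s}
  = \rho + (1-\rho)\,\frac{\lambda}{\lambda + s}.
```

The last expression is the transform of a random variable that is zero with probability $\rho$ and exponential($\lambda$) with probability $1-\rho$, which is the innovation used in (2.3).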

The point process with the interval structure (2.3) is called the
EAR(1) point process. It is a tractable model, and most of its important
properties are given in Gaver and Lewis (1978). In particular we have
that $\rho(k) = \rho^k$. This model is in a sense degenerate because it contains
runs of $X_i$ in which values are exactly $\rho$ times the previous
value; it could, however, be a reasonable model for point processes
observed in computer systems (e.g., inter-arrival times of requests to
a storage subsystem) in which the intervals have exponential marginal
distributions but are dependent. Note that as defined the model can only
provide sequences $\{X_i\}$ with positive serial correlations. We can,
however, define the process to include negative correlations (Gaver and
Lewis, 1978); there is also a way to obviate the degeneracy (Lawrance,
1978).
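
The recursion (2.3)-(2.4) is simple to simulate. The following is a minimal sketch in Python using only the standard library; the function names `simulate_ear1` and `sample_autocorr` and the parameter values are illustrative assumptions, not part of the original work.

```python
import random

def simulate_ear1(n, rho, lam=1.0, seed=0):
    """Generate n values of X_i = rho*X_{i-1} + V_i*E_i, started at X_0 = E_0."""
    rng = random.Random(seed)
    x = rng.expovariate(lam)              # X_0 = E_0 puts the sequence in its stationary law
    xs = [x]
    for _ in range(n - 1):
        # innovation: 0 w.p. rho, an independent exponential(lam) value w.p. 1 - rho
        eps = 0.0 if rng.random() < rho else rng.expovariate(lam)
        x = rho * x + eps
        xs.append(x)
    return xs

def sample_autocorr(xs, k):
    """Plain moment estimate of the lag-k serial correlation."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((v - mean) ** 2 for v in xs) / n
    cov = sum((xs[i] - mean) * (xs[i + k] - mean) for i in range(n - k)) / n
    return cov / var

xs = simulate_ear1(200_000, rho=0.6, lam=2.0, seed=1)
print(sum(xs) / len(xs))        # sample mean, to compare with 1/lam = 0.5
print(sample_autocorr(xs, 1))   # to compare with rho = 0.6
print(sample_autocorr(xs, 2))   # to compare with rho**2 = 0.36
```

The sample mean and the first two serial correlations should fall near $1/\lambda$, $\rho$ and $\rho^2$ respectively, up to sampling error.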

Simple  generalizations  of  this  first-order,  autoregressive,  Markovian 
exponential  process  are  the  following. 

2.2.   The  moving  average  exponential  model  (EMA(q)). 

We define another stationary sequence $\{X_i\}$, using the $\{E_i\}$ sequence
above, according to

$$X_0 = E_0, \tag{2.5}$$

$$X_i = \beta E_i + U_i E_{i-1}, \qquad i = 1, 2, \ldots; \quad 0 \le \beta \le 1, \tag{2.6}$$

where $\{U_i\}$ is an i.i.d. binary sequence in which $U_i = 1$ with probability
$(1-\beta)$. This is a first order exponential moving average
process (EMA(1)) (Lawrance and Lewis, 1977) which is one-dependent;
in particular

$$\rho(1) = \beta(1-\beta), \tag{2.7}$$

$$\rho(k) = 0, \qquad k = 2, 3, \ldots. \tag{2.8}$$

Properties of the EMA(1) process are given by Lawrance and Lewis (1977).
It is easy to see that we can make $E_{i-1}$ in (2.6) a random linear
combination of $E_{i-1}$ and $E_{i-2}$ to get an EMA(2) process, and can continue
the process back $q$ steps to obtain an EMA(q) process. The
general EMA(q) model takes the form


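$$X_i = \begin{cases}
\beta_q E_i & \text{w.p. } b_{q+1}, \\
\beta_q E_i + \beta_{q-1} E_{i-1} & \text{w.p. } b_q, \\
\quad \vdots \\
\beta_q E_i + \beta_{q-1} E_{i-1} + \cdots + \beta_1 E_{i-q+1} & \text{w.p. } b_2, \\
\beta_q E_i + \beta_{q-1} E_{i-1} + \cdots + \beta_1 E_{i-q+1} + E_{i-q} & \text{w.p. } b_1,
\end{cases} \tag{2.9}$$

for $0 \le \beta_1, \beta_2, \ldots, \beta_q \le 1$; $i = 0, \pm 1, \pm 2, \ldots$ and

$$b_i = \begin{cases}
\beta_q, & i = q+1, \\
(1-\beta_q) \cdots (1-\beta_i)\,\beta_{i-1}, & q \ge i \ge 2, \\
(1-\beta_q) \cdots (1-\beta_1), & i = 1.
\end{cases} \tag{2.10}$$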
q       ! 

Note that the $\beta_i$'s can be obtained uniquely from the $b_i$'s; there are
$q+1$ $b_i$'s but only $q$ $\beta$'s, since the sum of the $b_i$'s is equal to one.
This model is clearly only $q$ dependent; in particular the correlations
for the EMA(q) process are

$$\rho^{(q)}(k) = \mathrm{corr}(X_i, X_{i-k}) = \begin{cases} \sum_{v=1}^{q-k+1} b_v b_{v+k}, & 1 \le k \le q, \\ 0, & q+1 \le k < \infty. \end{cases} \tag{2.11}$$


Thus the serial correlations are just lagged products of the $b_i$ sequence
and the formula (2.11) is completely analogous to the formula for the
serial correlations of the standard MA(q) process; see Box and Jenkins
(1970, p. 68). It can be seen from (2.11) that all the correlations are
nonnegative  and  it  may  be  further  shown  that  they  are  bounded  above  by 
1/4. 
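
The weights (2.10) and the lagged-product formula (2.11) can be evaluated directly. The sketch below is illustrative only; the function names `ema_weights` and `ema_corr` and the list layout (`betas[0]` holding $\beta_1$) are assumptions made for this example.

```python
def ema_weights(betas):
    """Return [b_1, ..., b_{q+1}] of (2.10) from betas = [beta_1, ..., beta_q]."""
    q = len(betas)
    b = [0.0] * (q + 2)                  # 1-based storage: b[1..q+1]
    b[q + 1] = betas[q - 1]              # b_{q+1} = beta_q
    tail = 1.0                           # running product (1-beta_q)...(1-beta_i)
    for i in range(q, 0, -1):
        tail *= 1.0 - betas[i - 1]
        b[i] = tail * (betas[i - 2] if i >= 2 else 1.0)
    return b[1:]

def ema_corr(betas, k):
    """Lag-k correlation of (2.11): sum_{v=1}^{q-k+1} b_v b_{v+k}, zero beyond lag q."""
    q = len(betas)
    if k < 1 or k > q:
        return 0.0
    b = [None] + ema_weights(betas)      # restore 1-based indexing
    return sum(b[v] * b[v + k] for v in range(1, q - k + 2))

betas = [0.3, 0.7, 0.2]                  # an arbitrary EMA(3) example
print(ema_weights(betas))                # the four weights b_1..b_4, which sum to one
print([ema_corr(betas, k) for k in range(1, 5)])
print(ema_corr([0.5], 1))                # EMA(1) with beta = 1/2 attains beta*(1-beta) = 0.25
```

For $q = 1$ the function reduces to (2.7), and the value at $\beta = 1/2$ attains the upper bound of 1/4 mentioned above.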


2.3.   The  EARMA(1,1)  model. 

By making $E_{i-q}$ in (2.9) autoregressive over the previous $E_i$'s, we obtain
a mixed qth order moving-average, first order autoregressive process which
we denote by EARMA(1,q). Consider explicitly the case $q = 1$. The first
order moving-average and first order autoregressive process EARMA(1,1)
is given by

$$X_i = \beta E_i + U_i A_{i-1}, \tag{2.12}$$

with

$$A_{i-1} = \rho A_{i-2} + V_i E_{i-1}, \tag{2.13}$$

for $i = 1, 2, 3, \ldots$ and $A_{-1} = E_{-1}$, with $U_i$ and $V_i$ as defined above.
This  sequence  of  random  variables  is  not  Markovian. 

The  second-order  correlation  structure  of  the  process  is  given  by 

$$\rho(k) = \rho^{k-1} c(\beta, \rho), \tag{2.14}$$

where

$$c(\beta, \rho) = \beta(1-\beta) + \rho(1-\beta)(1-2\beta). \tag{2.15}$$

The point process whose intervals have the EARMA(1,1) structure is discussed
in detail in Jacobs and Lewis (1977). In particular, for $\beta = 1$
it is a Poisson process. The process is very simple to generate on a
computer  and  is  very  useful  for  modelling  dependent  sequences  in  queueing 
systems  (Jacobs,  1978;  Lewis  and  Shedler,  1978). 
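
As an illustration of how simple the generation is, the sketch below simulates (2.12)-(2.13) and compares the sample correlations with (2.14)-(2.15). It is a hedged example: the function names and parameter values are assumptions, and the binary variable in the autoregression is drawn once per step, which relabels the i.i.d. $V$ sequence without changing the distribution of the process.

```python
import random

def earma11_corr(k, beta, rho):
    """rho(k) = rho**(k-1) * c(beta, rho) from (2.14)-(2.15), for k >= 1."""
    c = beta * (1 - beta) + rho * (1 - beta) * (1 - 2 * beta)
    return rho ** (k - 1) * c

def simulate_earma11(n, beta, rho, lam=1.0, seed=0):
    rng = random.Random(seed)
    a = rng.expovariate(lam)                      # start the autoregression from an exponential value
    xs = []
    for _ in range(n):
        e = rng.expovariate(lam)                  # E_i
        u = 1 if rng.random() < 1 - beta else 0   # P{U_i = 1} = 1 - beta
        xs.append(beta * e + u * a)               # (2.12): X_i = beta*E_i + U_i*A_{i-1}
        v = 0 if rng.random() < rho else 1        # P{V = 0} = rho
        a = rho * a + v * e                       # (2.13) advanced one step: A_i from A_{i-1} and E_i
    return xs

def sample_autocorr(xs, k):
    n = len(xs)
    mean = sum(xs) / n
    var = sum((v - mean) ** 2 for v in xs) / n
    cov = sum((xs[i] - mean) * (xs[i + k] - mean) for i in range(n - k)) / n
    return cov / var

xs = simulate_earma11(200_000, beta=0.3, rho=0.5, seed=2)
for k in (1, 2, 3):
    print(k, sample_autocorr(xs, k), earma11_corr(k, 0.3, 0.5))
```

Setting `beta=1.0` makes `earma11_corr` vanish at every lag, in line with the remark that the process is then a Poisson process.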

2.4.   The pth-order autoregressive model EAR(p).

Quite recently ways have been found to obtain exponential sequences $\{X_i\}$
which have autoregressive structure of order p, and to combine these
with the moving average process to get a mixed autoregressive-moving
average process EARMA(p,q); see Lewis and Lawrance (1978). Another method
of defining pth-order autoregressive exponential sequences, which is
closely related to the DARMA(p,q) process discussed later, and which we
have only just begun to explore, is described here.

This  pth-order  exponential  autoregressive  model  can  be  written  as 

$$X_i = a_{S_i} X_{i-S_i} + \varepsilon_{i,S_i}, \tag{2.16}$$

where the $S_i$'s are i.i.d. discrete random variables taking values
$1, 2, \ldots, p$, and $\varepsilon_{i,S_i}$ is defined to be 0 w.p. $a_j$, and $E_i$ w.p.
$1 - a_j$, if $S_i = j$. If one assumes stationarity and that $X_{i-1}, X_{i-2}, \ldots$ are
marginally exponential($\lambda$), then $X_i$ is a random mixture of $E_i$ and
$X_{i-1}, \ldots, X_{i-p}$ and is exponential($\lambda$). The correlation equations from
this  process  are  variants  of  the  familiar  Yule-Walker  equations.   The 
model  is  more  tractable  than  the  pth-order  autoregressive  process  given 
in  Lewis  and  Lawrance  (1978)  and  is  probably  simpler  to  extend  to  other 
distributions  than  the  exponential. 

A drawback of these EARMA-type processes is that the serial correlations
are all positive, although the scheme given in Gaver and Lewis
(1978) for a negatively correlated EAR(1) process can probably be
extended to the complete EARMA(p,q) process.

2.5.   The semi-Markov generated point process with fixed marginal distribution.

The question arises as to whether there are interval processes $\{X_i\}$ with
exponential marginal distributions and, for example, ARMA(1,1) second-order
correlation structure and which cover a broader range of correlation
than the EARMA(1,1) process (though perhaps at a cost of more
complicated structure).


We  discuss  briefly  one  such  process.   It  is  a  special  case  of  the 
semi-Markov  generated  point  process  introduced  by  Cox  (1962)  and  extended 
by  Haskell  and  Lewis  (1978) .   We  first  describe  the  two-state  semi-Markov 
generated  model.   In  this  model  there  are  two  types  of  intervals  with 
distributions $F_1(x)$ and $F_2(x)$, sampled in accordance with a two-state
Markov chain for which the one-step transition matrix is

$$P = \begin{pmatrix} a_1 & 1-a_1 \\ 1-a_2 & a_2 \end{pmatrix} \tag{2.18}$$

and the stationary vector is

$$\pi = \pi P = \left( \frac{1-a_2}{2-a_1-a_2}, \; \frac{1-a_1}{2-a_1-a_2} \right). \tag{2.19}$$

When  we  form  the  point  process  we  assume  that  no  information  is  available 
about the type of interval, i.e., that in the actual bivariate point process
of transitions we suppress knowledge of the type of transition. Then
the distribution of an interval between transitions (events) $X_i$ in the
stationary point process is

$$F_X(x) = \pi_1 F_1(x) + \pi_2 F_2(x) \tag{2.20}$$

and the correlation between $X_i$ and $X_{i+k}$ is

$$\rho(k) = M \beta^k, \qquad k = 1, 2, \ldots, \tag{2.21}$$

where $M$ is a positive constant and $\beta = a_1 + a_2 - 1 = a_1 - (1 - a_2)$.
Thus  the  correlation  structure  is  that  of  an  ARMA(1,1)  process.   For  a 
derivation  of  this  result  see  Cox  and  Lewis  (1966,  Ch.  7,  194-196) . 
Lewis and Shedler (1973) used this process to model the page exception
process in a multiprogrammed computer system. The problem is to deal
with the mixture distribution (2.20) for the marginal distribution of
intervals; this seems to limit the utility of the model. However, there
is a way around it which produces a marginally controlled semi-Markov
generated process.

To obtain an exponential marginal distribution, consider the
following device (Jacobs and Lewis, 1977). Fix $x_0$, where $0 < x_0 < \infty$,

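and let

$$F_1(x) = \begin{cases} \dfrac{\int_0^x \lambda e^{-\lambda u}\,du}{1 - e^{-\lambda x_0}}, & 0 \le x \le x_0, \\ 1, & x > x_0; \end{cases} \qquad F_2(x) = \begin{cases} 0, & x \le x_0, \\ \dfrac{\int_{x_0}^x \lambda e^{-\lambda u}\,du}{e^{-\lambda x_0}}, & x > x_0; \end{cases} \tag{2.22}$$

then $F_X(x)$, the marginal distribution of an interval, is exponential($\lambda$)
if we set $\pi_1 = 1 - \exp(-\lambda x_0)$. There is one degree of freedom left in
the matrix $P$; in addition to $\lambda$, we have free parameters $\pi_1$ (or $x_0$)
and $a_1$, although the range of $a_1$ is restricted. What then is the range
of $\beta$, and can it be negative?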

Straightforward  manipulation  shows  that 

$$\beta = \frac{\pi_1 - a_1}{\pi_1 - 1}, \tag{2.23}$$

which lies in absolute value between zero and one but can be negative;
therefore the serial correlations can be negative. Thus the model appears
to be broader than the EARMA(1,1) model. The question of comparing the
two models when $\beta$ is positive has not yet been explored; it requires
higher  order  interval  correlations,  as  discussed  by  Brillinger  (1972). 
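
The device of (2.22) can be sketched in a few lines. In the example below the second diagonal entry $a_2$ is solved from $\pi_1/\pi_2 = (1-a_2)/(1-a_1)$, which follows from (2.19); the function names, the range check and the parameter values are assumptions made for illustration.

```python
import math
import random

def simulate_mc_semi_markov(n, lam, x0, a1, seed=0):
    """Two-state marginally controlled semi-Markov intervals with exponential(lam) marginal."""
    rng = random.Random(seed)
    pi1 = 1.0 - math.exp(-lam * x0)            # pi_1 = 1 - exp(-lam*x0) fixes the marginal
    a2 = 1.0 - pi1 * (1.0 - a1) / (1.0 - pi1)  # from pi_1/pi_2 = (1-a_2)/(1-a_1)
    if not (0.0 <= a1 <= 1.0 and 0.0 <= a2 <= 1.0):
        raise ValueError("a1 is outside the range allowed by pi_1")
    beta = (a1 - pi1) / (1.0 - pi1)            # algebraically the same as (2.23)
    state = 1 if rng.random() < pi1 else 2     # start the chain in its stationary law
    xs = []
    for _ in range(n):
        if state == 1:
            # inverse-CDF draw from the exponential restricted to (0, x0]
            x = -math.log(1.0 - rng.random() * pi1) / lam
            state = 1 if rng.random() < a1 else 2
        else:
            # exponential restricted to (x0, inf): x0 plus a fresh exponential
            x = x0 + rng.expovariate(lam)
            state = 2 if rng.random() < a2 else 1
        xs.append(x)
    return xs, beta

def sample_autocorr(xs, k):
    n = len(xs)
    mean = sum(xs) / n
    var = sum((v - mean) ** 2 for v in xs) / n
    cov = sum((xs[i] - mean) * (xs[i + k] - mean) for i in range(n - k)) / n
    return cov / var

xs, beta = simulate_mc_semi_markov(200_000, lam=1.0, x0=math.log(2.0), a1=0.2, seed=3)
print(beta)                     # negative here, because a1 < pi_1 = 0.5
print(sum(xs) / len(xs))        # sample mean, to compare with 1/lam = 1
print(sample_autocorr(xs, 1))   # negative lag-one correlation
```

Choosing $a_1$ below $\pi_1$ gives a negative $\beta$ and hence a negative lag-one correlation, while the marginal distribution stays exponential.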

2.6.   Generalizations 

The marginally controlled semi-Markov generated sequence $\{X_i\}$ discussed
above can be extended in such a way that $X_i$ will have any distribution,
say $F(x)$. Thus we let

$$F_1(x) = \begin{cases} \dfrac{F(x)}{F(x_0)}, & 0 \le x \le x_0, \\ 1, & x > x_0; \end{cases} \qquad F_2(x) = \begin{cases} 0, & x \le x_0, \\ \dfrac{F(x) - F(x_0)}{1 - F(x_0)}, & x > x_0; \end{cases} \tag{2.24}$$

then the marginal distribution of an interval is equal to $F(x)$, from
(2.20), if we set $\pi_1 = F(x_0)$. Note that the model is very non-linear
and the correlation structure is a complicated function of the functional
form of $F(x)$.

The  much  simpler  EARMA  structure  can  be  extended  to  some  extent. 
Random  variables  for  which  the  equation  (2.2)  has  a  proper  solution  are 
called self-decomposable random variables or random variables of type L.
This class includes random variables with Gamma, Cauchy, Pareto, double
exponential and perhaps many other distributions. For these random
variables, a pth-order autoregressive process can be defined as at (2.16).
The unique feature of the exponential process is that the $\varepsilon_i$ which makes
$X_i$ exponential($\lambda$) in (2.2) is again an exponential($\lambda$) random variable,
albeit  mixed  with  an  atom  at  zero.   This  property,  shared  with  the  double 
exponential  and  normal  random  variables,  is  what  makes  it  simple  to  define 
a  moving-average  type  process,  as  at  (2.9). 

3.    Count  Models:   Sequences  of  discrete-valued  random  variables. 

As  remarked  earlier,  most  data  on  point  processes  is  recorded  as 
numbers  of  events  in  successive  fixed-length  intervals.  Despite  this 
fact,  most  point  process  models  assume  that  exact  times  of  events  are 
known  and  it  is  not  simple  to  derive  from  these  models  the  statistics 
of  the  counts  in  fixed  intervals.  Thus  in  this  area  in  particular 
flexible  models  for  discrete-valued  random  variables  are  needed. 

Another  application  might  be  to  modelling  of  air  pollution  data  in 
which concentrations of various chemicals in the air are indicated on a
scale  of  zero  to  ten.   In  general  this  situation  requires  multivariate 
time  series,  but  space  prohibits  discussion  of  multivariate  versions  of 
the  DARMA-type  processes  discussed  in  this  section. 

3.1.   The first-order autoregressive discrete model (DAR(1)).

Again we denote the sequence of discrete-valued random variables by
$\{X_i\}$. If the $X_i$ are counts in a Poisson process then the $X_i$'s are i.i.d.
Poisson-distributed random variables. Once dependence is observed in
data it is useful to assume, as a first cut, that the dependence is
Markovian and use a Markov chain model in which the distribution of
$X_{i+1}$ depends only on the value of $X_i$ and is specified by the transition
matrix $P$ with elements

$$P(k,j) = P\{X_{i+1} = j \mid X_i = k\}, \tag{3.1}$$

with $j$ and $k$ taking values in the space $E$, a discrete subset of the
real line. Under suitable conditions there is a stationary distribution
$\pi$ for $\{X_i\}$ given by the equation

$$\pi = \pi P. \tag{3.2}$$

The Markov chain model (3.2) is by virtue of its place in the statistician's
toolbox the discrete counterpart of the AR(1) process. However
the AR(1) process has one dependency parameter $\rho$, plus any parameters which
specify the distribution of the $\varepsilon_i$'s. The Markov chain on the other hand
can have an infinite number of parameters and in many cases $\pi$ cannot be
obtained explicitly from (3.2). This is awkward for statistical analysis.
A solution is given by constructing the DAR(1) model (discrete autoregressive
model of order one) which is an analog of the EAR(1) model, as follows.

Let $Y_i$ be an i.i.d. sequence of random variables taking values in
the space $E$ with distribution $\pi$, and let $V_i$ be an i.i.d. binary sequence for which
$P\{V_i = 1\} = \rho$. Then

$$X_i = V_i X_{i-1} + (1 - V_i) Y_i, \qquad i = 0, \pm 1, \pm 2, \ldots; \quad 0 \le \rho < 1, \tag{3.3}$$

$$X_i = \begin{cases} X_{i-1} & \text{w.p. } \rho, \\ Y_i & \text{w.p. } (1-\rho). \end{cases} \tag{3.4}$$

If $X_0$ has distribution $\pi$, then so does $X_1$, since it is a mixture of two
random variables, $X_0$ and $Y_1$, each with distribution $\pi$. Consequently all
the $X_i$, $i = 1, 2, \ldots$, have marginal distribution $\pi$.

Note that $\{X_i\}$ is a Markov chain with transition probabilities

$$P(k,j) = \rho\,\delta_{kj} + (1-\rho)\,\pi(j), \tag{3.5}$$

where $\delta_{kj}$ equals one if $k = j$ and zero otherwise;
in fact it is a Markov chain in which the correlation structure is specified
by one parameter $\rho$, and with specified marginal (stationary)
distribution $\pi$. Thus $\pi$ may be a Poisson distribution and then the
DAR(1) model is a 2-parameter $(\lambda, \rho)$ Markov chain. The analogy with the
AR(1) model is clear.

As with the EAR(1) model the serial correlations are $\rho(k) = \rho^k \ge 0$.
Extensions  to  negatively  correlated  sequences  are  given  in  Jacobs  and 
Lewis  (1978). 
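
A DAR(1) sequence with a Poisson marginal, the 2-parameter $(\lambda, \rho)$ chain mentioned above, can be sketched as follows. The Poisson sampler uses Knuth's multiplication method, chosen here only to keep the example within the standard library; all function names and parameter values are illustrative assumptions.

```python
import math
import random

def poisson_draw(rng, mu):
    """Knuth's multiplication method; adequate for small mu."""
    limit, k, prod = math.exp(-mu), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def simulate_dar1(n, rho, draw, seed=0):
    """X_i = X_{i-1} w.p. rho, otherwise a fresh Y_i drawn from pi via draw(rng)."""
    rng = random.Random(seed)
    x = draw(rng)                      # X_0 drawn from pi, so the sequence starts stationary
    xs = [x]
    for _ in range(n - 1):
        if rng.random() >= rho:        # V_i = 0 w.p. 1 - rho: replace by Y_i
            x = draw(rng)
        xs.append(x)                   # otherwise V_i = 1 and X_i = X_{i-1}
    return xs

def sample_autocorr(xs, k):
    n = len(xs)
    mean = sum(xs) / n
    var = sum((v - mean) ** 2 for v in xs) / n
    cov = sum((xs[i] - mean) * (xs[i + k] - mean) for i in range(n - k)) / n
    return cov / var

xs = simulate_dar1(200_000, rho=0.4, draw=lambda rng: poisson_draw(rng, 1.5), seed=4)
print(sum(xs) / len(xs))        # sample mean, to compare with the Poisson mean 1.5
print(sample_autocorr(xs, 1))   # to compare with rho = 0.4
print(sample_autocorr(xs, 2))   # to compare with rho**2 = 0.16
```

Any other sampler for a discrete distribution $\pi$ can be passed as `draw` without changing the correlation structure.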

3.2.   The  pth-order  autoregressive  discrete  model  (DAR(p)). 
First  order  Markov  dependence  is  a  special  kind  of  dependence  which  is 
attractive  because  of  analytical  tractability  considerations,  but  it  is 
not  necessarily  met  with  in  practice.   One  immediate  consequence  of  the 
Markovian  property  is  that  runs  of  distinct  values,  say  X.  =  j,  have  a 
length  which  is  geometrically  distributed  (Jacobs  and  Lewis,  1978a)  and 
this  is  easily  checked  in  data.   If  the  data  fails  to  have  this  property, 
what  other  types  of  dependency  can  be  utilized? 

A  first  direction  might  be  to  go  to  higher  order  (say  pth-order) 
autoregression,  which  is  an  explicit  pth-order  Markov  structure,  and 
the  DAR(l)  model  can  be  extended  in  this  direction.   Thus  in  addition 
to the assumptions at (3.3) let $A_i$ be an i.i.d. sequence of random
variables taking values in $\{1, 2, \ldots, p\}$, with $P\{A_i = j\} = a_j$. Then
the DAR(p) process is defined as

$$X_i = V_i X_{i-A_i} + (1 - V_i) Y_i, \qquad i = 0, \pm 1, \pm 2, \ldots, \tag{3.6}$$

so that $X_i$ is (exclusively) either one of the previous $p$ values $X_{i-1}, \ldots, X_{i-p}$,
or the error term $Y_i$. Properties of this model are developed
extensively in Jacobs and Lewis (1978c). When $a_1 = 1$, and all other $a_j$'s
are zero it is the DAR(1) model.

Yule-Walker  equations  for  the  correlations  in  the  stationary  DAR(p) 
process are given in Jacobs and Lewis (1978c) as well as stationarity
conditions. In particular for $p = 2$ we have the limiting result

$$v(k,j) = \lim_{i \to \infty} P\{X_{i+1} = k, X_{i+2} = j\} = \begin{cases} \{1 - \rho(1)\}\,\pi(k)\pi(j), & k \ne j, \\ \rho(1)\,\pi(j) + \{1 - \rho(1)\}\,\pi(j)^2, & k = j, \end{cases} \tag{3.7}$$

where $\rho(1) = \mathrm{corr}(X_i, X_{i+1})$ in the stationary process. Thus, if we let
$X_0$ and $X_{-1}$ have the joint distribution $v(k,j)$, a stationary, second-order
autoregressive process with any marginal distribution can be
generated.   A  scheme  for  obtaining  sequences  which  are  possibly  negatively 
correlated  is  given  in  Jacobs  and  Lewis  (1978c) . 
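
The DAR(p) recursion (3.6) can be simulated in the same way. In the sketch below the lag-one correlation for $p = 2$, $\rho(1) = \rho a_1 / (1 - \rho a_2)$, is obtained here by taking covariances in (3.6); it is stated as an assumption of this example rather than quoted from the references. The start-up uses independent draws from $\pi$ instead of the joint distribution (3.7), so the first few values are only approximately stationary.

```python
import random

def simulate_darp(n, rho, a, draw, seed=0):
    """X_i = X_{i-A_i} w.p. rho (A_i = j w.p. a[j-1]), otherwise a fresh Y_i from pi."""
    rng = random.Random(seed)
    p = len(a)
    lags = list(range(1, p + 1))
    xs = [draw(rng) for _ in range(p)]      # crude start: p independent draws from pi
    for _ in range(n):
        if rng.random() < rho:              # V_i = 1: copy one of the previous p values
            lag = rng.choices(lags, weights=a)[0]
            xs.append(xs[-lag])
        else:                               # V_i = 0: take the error term Y_i
            xs.append(draw(rng))
    return xs[p:]

def dar2_rho1(rho, a1, a2):
    """Lag-one correlation for p = 2 (derived for this sketch, see lead-in)."""
    return rho * a1 / (1.0 - rho * a2)

def sample_autocorr(xs, k):
    n = len(xs)
    mean = sum(xs) / n
    var = sum((v - mean) ** 2 for v in xs) / n
    cov = sum((xs[i] - mean) * (xs[i + k] - mean) for i in range(n - k)) / n
    return cov / var

uniform4 = lambda rng: rng.randrange(4)     # illustrative marginal on {0, 1, 2, 3}
xs = simulate_darp(200_000, rho=0.5, a=[0.6, 0.4], draw=uniform4, seed=5)
print(sample_autocorr(xs, 1), dar2_rho1(0.5, 0.6, 0.4))
```

With `a=[1.0]` the function reduces to the DAR(1) model, as noted in the text.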

3.3.   The  q-th  order  moving  average  discrete  model  (DMA(q)). 
The  other  alternative  to  Markovian  dependence  (of  any  order)  which  is 
usually  considered  in  time  series  analysis  is  the  finite-length  dependence 
produced  by  the  moving-average  part  of  the  ARMA(p,q)  process  (1.1).   This 
type  of  behavior  is  easily  produced  for  discrete  random  variables  by  a 
random  index  model  of  the  type 


$$X_i = Y_{i-S_i}, \tag{3.8}$$

where the $S_i$ are i.i.d. random variables with $P\{S_i \le k\} = b_k$. Thus we may
write

$$X_i = Y_{i-k} \quad \text{w.p. } b_k - b_{k-1}, \qquad k = 0, \ldots, q; \quad b_{-1} = 0. \tag{3.9}$$

The autoregressive process DAR(p) is also a random index model, but the
random indices are not independent. The correlation structure of this
DMA(q) process is easily found to be

$$\rho^{(q)}(k) = \mathrm{corr}(X_i, X_{i+k}) = \begin{cases} \sum_{v=0}^{q-k} (b_v - b_{v-1})(b_{v+k} - b_{v+k-1}), & 1 \le k \le q, \\ 0, & k > q. \end{cases} \tag{3.10}$$

This is the exact analog of (2.11) for the EMA(q) process and the corresponding
formula for the MA(q) process. Note that the DMA(q) process
is  not  Markovian.   Runs  properties  of  the  process  are  given  in  Jacobs 
and  Lewis  (1978a);  the  runs  are  not  geometrically  distributed. 
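
The random index construction is easy to check numerically. Two values $X_i$ and $X_{i+k}$ coincide with the same $Y$ exactly when $S_{i+k} = S_i + k$, so the lag-$k$ correlation is the sum over $v$ of $P\{S = v\} P\{S = v + k\}$. The sketch below computes this from the cumulative probabilities and compares it with a simulation; the function names and the example distribution are assumptions.

```python
import random

def dma_corr(b, k):
    """Lag-k correlation of the DMA(q) process; b[j] = P{S <= j}, j = 0..q, with b[q] == 1."""
    q = len(b) - 1
    if k < 1 or k > q:
        return 0.0
    prob = [b[0]] + [b[j] - b[j - 1] for j in range(1, q + 1)]   # P{S = j}
    return sum(prob[v] * prob[v + k] for v in range(q - k + 1))

def simulate_dma(n, b, draw, seed=0):
    """X_i = Y_{i - S_i} with i.i.d. indices S_i drawn from the cumulative probabilities b."""
    rng = random.Random(seed)
    q = len(b) - 1
    ys = [draw(rng) for _ in range(n + q)]          # ys[i + q] holds Y_i for i = -q..n-1
    xs = []
    for i in range(n):
        u = rng.random()
        s = next(j for j in range(q + 1) if u < b[j])   # S_i by inversion
        xs.append(ys[i + q - s])
    return xs

def sample_autocorr(xs, k):
    n = len(xs)
    mean = sum(xs) / n
    var = sum((v - mean) ** 2 for v in xs) / n
    cov = sum((xs[i] - mean) * (xs[i + k] - mean) for i in range(n - k)) / n
    return cov / var

b = [0.5, 1.0]                                      # DMA(1) with P{S=0} = P{S=1} = 1/2
xs = simulate_dma(100_000, b, lambda rng: rng.randrange(4), seed=6)
print(sample_autocorr(xs, 1), dma_corr(b, 1))       # theoretical value 0.25
print(sample_autocorr(xs, 2), dma_corr(b, 2))       # theoretical value 0.0
```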

3.4.   Mixed  autoregressive-moving  average  discrete  models. 
As  in  the  case  of  the  ARMA(p,q)  model  (1.1),  it  is  useful  to  have  both 
autoregressive, Markovian dependence and moving average dependence combined
into one model. In Jacobs and Lewis (1978a) this was done by
replacing the $Y_{i-q}$ term in (3.8) by a discrete autoregression (3.3)
over $Y_{i-q}, Y_{i-q-1}, \ldots$. Clearly this can be extended by replacing
$Y_{i-q}$ by a p-th order autoregression (3.6) over $Y_{i-q}, Y_{i-q-1}, \ldots$ to
obtain a DARMA(p,q) model which is the analog of the EARMA(p,q) model
of Lawrance and Lewis (1978). This is not a complete analog of the
ARMA(p,q) model in that there is no cross-over of the autoregression and
the moving average, but it is in fact possible to do this to obtain a
model called NDARMA(p,q) as follows:
Let

$$X_i = V_i X_{i-A_i} + (1 - V_i) Y_{i-S_i}, \qquad i = 0, \pm 1, \pm 2, \ldots, \tag{3.11}$$

where the $A_i$ are i.i.d. random variables taking values in $\{1, 2, \ldots, p\}$
with $P\{A_i = j\} = a_j$; the $S_i$ are i.i.d. random variables taking values in
$\{0, \ldots, q\}$ with $P\{S_i \le k\} = F(k)$; and the $V_i$'s are i.i.d. Bernoulli
random variables with $P\{V_i = 1\} = \rho$.

The  model  works  because  a  mixture  of  dependent  random  variables,  all 
with marginal distribution $\pi$, has distribution $\pi$; thus if $X_{i-1}, \ldots,$
$X_{i-p}$ have marginal distribution $\pi$, then so will $X_i$ since it is a mixture
of the dependent random variables $X_{i-1}, \ldots, X_{i-p}$ and $Y_i, \ldots, Y_{i-q}$.

Note that when $\rho = 0$ we have the DMA(q) process; if in addition $F(0) = 1$
the sequence is i.i.d. since $X_i = Y_i$. When $1 > \rho > 0$, $F(0) = 1$ we have
the DAR(p) process. Thus the parameters are such that interesting special
cases fall out easily. Moreover the $\rho$ parameter measures the degree of
mixture of Markovian and moving average dependence, and the distributions
of the $A_i$'s and $S_i$'s give a picture of where the dependence is lagged
over previous $X_i$ or $Y_i$ values.

The  model  (3.11)  has  not  yet  been  fully  explored.   At  first  sight  it 
seems preferable to the DARMA(p,q) model, possibly because of the compactness
of (3.11) and its close analogy to ARMA(p,q) models. The DARMA(p,q)
and NDARMA(p,q) models are, however, distinct and in fact preliminary
investigation of the (1,1) case shows that the DARMA(1,1) model (Jacobs
and Lewis, 1978a) has a broader correlation structure than does the
NDARMA(1,1). On the other hand the autoregression is not explicit in
the  DARMA(1,1)  model.   Both  models,  therefore,  will  probably  be  useful 
in  modelling  discrete  data  such  as  occur  in  sampled  point  processes. 
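
The NDARMA(p,q) recursion and its special cases can be sketched as below. The function names, the burn-in length and the start-up from independent draws are assumptions made for this example; the special-case checks use the DAR(1) correlation $\rho$ and the lag-one DMA(1) value $P\{S=0\}P\{S=1\}$.

```python
import random

def simulate_ndarma(n, rho, a, F, draw, seed=0, burn=1000):
    """X_i = V_i*X_{i-A_i} + (1-V_i)*Y_{i-S_i}.

    a[j-1] = P{A = j} for j = 1..p; F[k] = P{S <= k} for k = 0..q, with F[q] == 1;
    draw(rng) returns one value from the marginal distribution pi.
    """
    rng = random.Random(seed)
    p, q = len(a), len(F) - 1
    lags = list(range(1, p + 1))
    ys = [draw(rng) for _ in range(q)]          # Y values older than the first step
    xs = [draw(rng) for _ in range(p)]          # crude start: independent draws from pi
    for _ in range(n + burn):
        ys.append(draw(rng))                    # Y_i, newest value last
        if rng.random() < rho:                  # V_i = 1: copy an earlier X
            lag = rng.choices(lags, weights=a)[0]
            xs.append(xs[-lag])
        else:                                   # V_i = 0: copy a lagged Y
            u = rng.random()
            s = next(k for k in range(q + 1) if u < F[k])
            xs.append(ys[-1 - s])
    return xs[-n:]                              # drop the start-up and burn-in values

def sample_autocorr(xs, k):
    n = len(xs)
    mean = sum(xs) / n
    var = sum((v - mean) ** 2 for v in xs) / n
    cov = sum((xs[i] - mean) * (xs[i + k] - mean) for i in range(n - k)) / n
    return cov / var

uniform4 = lambda rng: rng.randrange(4)         # illustrative marginal on {0, 1, 2, 3}
dar1 = simulate_ndarma(100_000, 0.4, [1.0], [1.0], uniform4, seed=7)
dma1 = simulate_ndarma(100_000, 0.0, [1.0], [0.5, 1.0], uniform4, seed=8)
mixed = simulate_ndarma(100_000, 0.3, [0.5, 0.5], [0.2, 0.7, 1.0], uniform4, seed=9)
print(sample_autocorr(dar1, 1))   # F(0) = 1: DAR(1) special case, compare with rho = 0.4
print(sample_autocorr(dma1, 1))   # rho = 0: DMA(1) special case, compare with 0.25
print(sum(mixed) / len(mixed))    # marginal mean of the mixed model, compare with 1.5
```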

3.5.   The  marginally  controlled  semi-Markov  generated  process. 
In  the  structure  of  the  2-state  marginally  controlled  semi-Markov  generated 
process  detailed  at  (2.24)  no  assumption  was  made  about  continuity  of   F(x). 
Thus $F(x)$ could be discrete, giving a sequence $\{X_i\}$ with known discrete
marginal distribution $F(x)$ and ARMA(1,1) correlation structure. By
going to an n-state semi-Markov model, a process with ARMA(p,q) correlation
structure can be generated (Haskell and Lewis, 1978) with n a
function  of  p  and  q,  and  the  procedure  to  obtain  a  given  marginal 
distribution  is  just  an  extension  of  (2.24).   Thus  we  have,  in  terms 
of  the  quantification  of  the  process  by  marginal  distribution  and 
correlation  structure,  a  direct  competitor  to  the  DARMA-type  processes. 

Comparison  of  the  two  types  of  discrete  processes  is  interesting  and 
points  up  the  simplicity  of  the  DARMA-type  processes.   In  particular  the 
correlation  structure  of  the  DARMA(p,q)  process  is  explicit  in  form  if 
not  in  detail  and  the  process  is  a  simple,  random  linear  combination  of 
random variables generated from an i.i.d. sequence $Y_i$. This is clearly
not  true  for  the  marginally  controlled  semi-Markov  generated  process; 
the  recognition  that  its  correlation  structure  is  ARMA-type  is  accidental 
and  not  intuitive.   Deeper  comparison  of  these  processes  in  terms,  say, 
of  the  range  of  correlation  the  model  will  encompass  will  be  instructive. 
Here  again  the  DARMA-type  processes  have  an  advantage;  their  correlation 
structure is independent of the marginal distribution $\pi$.

4.    Summary and Conclusions

We  have  outlined  in  this  paper  three  models  for  discrete-valued  and 
positive-valued  time  series,  all  of  which  to  some  degree  satisfy  the 
criteria of flexibility or simplicity or both set forth in the introduction.
Perhaps the main point about the models is that they are
designed to accommodate situations in which the marginal distributions
in  the  stationary  processes  are  given  and  are  non-normal. 

Properties  of  these  models  such  as  mixing  and  asymptotic  results, 
higher-order  moments,  distributions  of  runs  for  the  discrete  models  and 
sums  of  random  variables  and  point  spectra  are  considered  in  the 
references. 

There  are  many  other  properties  of  the  processes  which  are  still 
to  be  explored.   Statistical  estimation,  except  in  an  ad  hoc  manner  and 
for the Markovian cases, is difficult and has yet to be examined.
Extension to multivariate cases is of great interest for real applications
and has been done to some degree in the context of queues with
correlated service and arrival times (Jacobs, 1978, and Lewis and Shedler,
1978). The DARMA-type processes, in particular, can be easily extended
to coupled equations in the same way as linear processes are extended in
econometric models. They might therefore find use in modelling multivariate
situations such as the number of cars passing different points
in  a  road  evaluated  in  successive  fixed  time  intervals. 

Finally  an  important  problem  is  to  extend  the  models  so  as  to  include 
inhomogeneity, particularly of the seasonal type, and the effects of concomitant
or auxiliary variables. Several schemes are under consideration
for  these  extensions  of  the  models. 